Optimizing Sparse Matrix-Vector Multiplication Based on GPU
Abstract
In recent years, graphics processing units (GPUs) have attracted the attention of many application developers as powerful, massively parallel systems. Compute Unified Device Architecture (CUDA), a general-purpose parallel computing architecture, makes GPUs an appealing choice for solving many complex computational problems more efficiently. Sparse matrix-vector multiplication (SpMV) is one of the most important kernels in scientific computing. In this paper, we propose two new parallel algorithms, CSR-M, based on the CSR format, and ELLPACK-R, based on the ELLPACK format, and implement both as GPU kernels in CUDA. We discuss how to implement and optimize SpMV on GPUs with the CUDA programming model. The optimization strategies include thread mapping, merged (coalesced) memory access, data reuse, branch avoidance, and thread-block tuning. The experimental results show that the proposed optimization strategies improve performance and memory bandwidth utilization and reduce kernel execution time.
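To make the two storage formats named in the abstract concrete, here is a minimal sequential sketch in Python of SpMV over CSR and over ELLPACK-R. It is not the paper's CUDA kernels, and it does not attempt to reproduce CSR-M, whose details the abstract does not give. The function names (`spmv_csr`, `csr_to_ellpack_r`, `spmv_ellpack_r`) and the small test matrix are illustrative assumptions. The ELLPACK-R layout used here is the commonly described one: padded rows stored column-major, plus an array of true row lengths. Each iteration of the outer `for row` loop corresponds to the work one GPU thread would do under a one-thread-per-row mapping.

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):  # one GPU thread per row in a scalar kernel
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[row] = acc
    return y


def csr_to_ellpack_r(row_ptr, col_idx, vals):
    """Convert CSR to ELLPACK-R: padded column-major arrays plus row lengths."""
    n_rows = len(row_ptr) - 1
    row_len = [row_ptr[r + 1] - row_ptr[r] for r in range(n_rows)]
    width = max(row_len, default=0)
    # Slot j of row r lives at index j * n_rows + r, so that consecutive
    # rows (consecutive threads) touch consecutive memory locations.
    ell_vals = [0.0] * (width * n_rows)
    ell_cols = [0] * (width * n_rows)
    for r in range(n_rows):
        for j in range(row_len[r]):
            k = row_ptr[r] + j
            ell_vals[j * n_rows + r] = vals[k]
            ell_cols[j * n_rows + r] = col_idx[k]
    return ell_vals, ell_cols, row_len


def spmv_ellpack_r(ell_vals, ell_cols, row_len, x):
    """Compute y = A @ x for A stored in ELLPACK-R form."""
    n_rows = len(row_len)
    y = [0.0] * n_rows
    for row in range(n_rows):  # one GPU thread per row
        acc = 0.0
        # row_len bounds the loop, so padded slots are never visited
        # and no per-element "is this padding?" branch is needed.
        for j in range(row_len[row]):
            idx = j * n_rows + row
            acc += ell_vals[idx] * x[ell_cols[idx]]
        y[row] = acc
    return y


# Illustrative 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 5, 6]] in CSR form.
row_ptr = [0, 2, 3, 6]
col_idx = [0, 2, 1, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0]

y_csr = spmv_csr(row_ptr, col_idx, vals, x)
ell_vals, ell_cols, row_len = csr_to_ellpack_r(row_ptr, col_idx, vals)
y_ell = spmv_ellpack_r(ell_vals, ell_cols, row_len, x)
```

In this sketch both routines produce the same result vector. The column-major layout in `csr_to_ellpack_r` is what would let neighbouring threads read neighbouring addresses on a GPU, which relates to the merged-access strategy listed in the abstract.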
Similar resources
Optimizing Sparse Matrix-Matrix Multiplication on a Heterogeneous CPU-GPU Platform
Sparse Matrix-Matrix multiplication (SpMM) is a fundamental operation over irregular data, which is widely used in graph algorithms, such as finding minimum spanning trees and shortest paths. In this work, we present a hybrid CPU and GPU-based parallel SpMM algorithm to improve the performance of SpMM. First, we improve data locality by element-wise multiplication. Second, we utilize the ordere...
Sparse Matrix-Vector Multiplication on NVIDIA GPU
In this paper, we present our work on developing a new matrix format and a new sparse matrix-vector multiplication algorithm. The matrix format is HEC, which is a hybrid format. This matrix format is efficient for sparse matrix-vector multiplication and is friendly to preconditioner. Numerical experiments show that our sparse matrix-vector multiplication algorithm is efficient on
A hybrid format for better performance of sparse matrix-vector multiplication on a GPU
In this paper, we present a new sparse matrix data format that leads to improved memory coalescing and more efficient sparse matrix-vector multiplication (SpMV) for a wide range of problems on high throughput architectures such as a graphics processing unit (GPU). The sparse matrix structure is constructed by sorting the rows based on the row length (defined as the number of non-zero elements i...
A New Sparse Matrix Vector Multiplication GPU Algorithm Designed for Finite Element Problems
Recently, graphics processors (GPUs) have been increasingly leveraged in a variety of scientific computing applications. However, architectural differences between CPUs and GPUs necessitate the development of algorithms that take advantage of GPU hardware. As sparse matrix vector multiplication (SPMV) operations are commonly used in finite element analysis, a new SPMV algorithm and several vari...